Need and Role of Scala Implementations in Bioinformatics
Abstract
Next Generation Sequencing has produced large volumes of omics data at a speed that was not previously possible. This data is useful only if it can be stored and analyzed at a comparable speed. Big Data platforms and tools such as Apache Hadoop and Spark have solved this problem. However, most of the algorithms used in bioinformatics for pairwise alignment, multiple alignment, and motif finding have not been implemented for Hadoop or Spark. Scala is a powerful language supported by Spark. It provides constructs such as traits, closures, functions, pattern matching, and extractors that make it suitable for bioinformatics applications. This article explores the areas of bioinformatics where Scala can be used efficiently for data analysis. It also highlights the need for Scala implementations of the algorithms used in bioinformatics.
Keywords: Scala; Big Data; Hadoop; Spark; Next Generation Sequencing; Genomics; RNA; DNA; Bioinformatics
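The abstract names traits, closures, pattern matching, and extractors as the Scala constructs that suit bioinformatics work. The following is a minimal sketch of how those constructs might look on DNA sequences. It is not taken from the article: every name in it (`Nucleotide`, `Codon`, `SeqOps`, `kmerCounter`, and so on) is a hypothetical choice made for illustration, and it uses only the Scala standard library.

```scala
// Illustrative sketch only; all names below are hypothetical, not from the article.

// Trait: a closed set of bases, each knowing its Watson-Crick complement.
sealed trait Nucleotide { def complement: Nucleotide }
case object A extends Nucleotide { def complement: Nucleotide = T }
case object C extends Nucleotide { def complement: Nucleotide = G }
case object G extends Nucleotide { def complement: Nucleotide = C }
case object T extends Nucleotide { def complement: Nucleotide = A }

object Nucleotide {
  // Pattern matching: map a character to a typed base, rejecting anything else.
  def fromChar(c: Char): Option[Nucleotide] = c.toUpper match {
    case 'A' => Some(A)
    case 'C' => Some(C)
    case 'G' => Some(G)
    case 'T' => Some(T)
    case _   => None
  }
}

// Extractor: destructure a string into its first three letters and the remainder.
object Codon {
  def unapply(s: String): Option[(String, String)] =
    if (s.length >= 3) Some((s.take(3), s.drop(3))) else None
}

object SeqOps {
  // Returns None if any character is not one of A, C, G, T.
  def parse(s: String): Option[List[Nucleotide]] = {
    val bases = s.toList.map(Nucleotide.fromChar)
    if (bases.forall(_.isDefined)) Some(bases.flatten) else None
  }

  // Higher-order function use: reverse the strand, then complement each base.
  def reverseComplement(seq: List[Nucleotide]): List[Nucleotide] =
    seq.reverse.map(_.complement)

  // Fraction of bases that are G or C; 0.0 for an empty sequence.
  def gcContent(seq: List[Nucleotide]): Double =
    if (seq.isEmpty) 0.0
    else seq.count(b => b == G || b == C).toDouble / seq.length

  // Uses the Codon extractor to split a string into complete triplets,
  // dropping any trailing one or two letters.
  @annotation.tailrec
  def codons(s: String, acc: List[String] = Nil): List[String] = s match {
    case Codon(head, rest) => codons(rest, head :: acc)
    case _                 => acc.reverse
  }

  // Closure: the returned function captures k from the enclosing call.
  def kmerCounter(k: Int): String => Map[String, Int] =
    s => s.sliding(k).filter(_.length == k).toList
      .groupBy(identity)
      .map { case (kmer, hits) => kmer -> hits.size }
}
```

Because these functions are pure and operate on immutable values, functions of this shape could in principle be passed to distributed transformations such as Spark's `map`, which is the kind of use the abstract has in mind. The sketch itself runs without Spark.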
Similar resources
Scala Roles - A Lightweight Approach Towards Reusable Collaborations
Purely class-based implementations of object-oriented software are often inappropriate for reuse. In contrast, the notion of objects playing roles in a collaboration has been proven to be a valuable reuse abstraction. However, existing solutions to enable role-based programming tend to require vast extensions of the underlying programming language, and thus, are difficult to use in every day wo...
An Open Framework for Extensible Multi-stage Bioinformatics Software
In research labs, there is often a need to customise software at every step in a given bioinformatics workflow, but traditionally it has been difficult to obtain both a high degree of customisability and good performance. Performance-sensitive tools are often highly monolithic, which can make research difficult. We present a novel set of software development principles and a bioinformatics fram...
Deciphering the functional role of hypothetical proteins from Chloroflexus aurantiacs J-10-f1 using bioinformatics approach
Chloroflexus aurantiacus J-10-f1 is an anoxygenic, photosynthetic, facultatively autotrophic, gram-negative bacterium found in hot springs at temperatures of 50-60°C. It can sustain itself in the dark only if oxygen is available, exhibiting a dark orange color, but displays a dark green color when grown in sunlight. The genome of the organism contains a total of 3853 proteins out ...
Two Approaches to Portable Macros
For any programming language that supports macros and has multiple implementations (each with different AST definitions), there is a common problem: how to make macros that operate on ASTs portable among different compiler implementations? Implementing portable macros is especially important for statically typed languages like Scala, as IDE vendors usually have different implementations of the lan...
Bioinformatics to Biostochastics: Statistical Perspectives and Tasks Ahead
Bioinformatics is an emerging field of science emphasizing the application of mathematics, statistics, and informatics to the study and analysis of very large molecular biological (mostly genetic and genomic) systems (data sets). In a comparatively broader setup of large biological systems without necessarily having a predominant genetic undercurrent, and having genesis in biometry to biostatistic...
Publication date: 2017